AITopics | training speed

Collaborating Authors

training speed

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Speedy Performance Estimation for Neural Architecture Search

Neural Information Processing SystemsApr-25-2026, 01:53:53 GMT

Reliable yet efficient evaluation of generalisation performance of a proposed architecture is crucial to the success of neural architecture search (NAS). Traditional approaches face a variety of limitations: training each architecture to completion is prohibitively expensive, early stopped validation accuracy may correlate poorly with fully trained performance, and model-based estimators require large training sets. We instead propose to estimate the final test performance based on a simple measure of training speed. Our estimator is theoretically motivated by the connection between generalisation and training speed, and is also inspired by the reformulation of a PAC-Bayes bound under the Bayesian setting. Our modelfree estimator is simple, efficient, and cheap to implement, and does not require hyperparameter-tuning or surrogate training before deployment. We demonstrate on various NAS search spaces that our estimator consistently outperforms other alternatives in achieving better correlation with the true test performance rankings. We further show that our estimator can be easily incorporated into both query-based and one-shot NAS methods to improve the speed or quality of the search.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

123cfe7d8b7702ac97aaf4468fc05fa5-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 01:52:47 GMT

artificial intelligence, machine learning, neighborhood, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.73)

Add feedback

Optimal Stochastic and Online Learning with Individual Iterates

Yunwen Lei, Peng Yang, Ke Tang, Ding-Xuan Zhou

Neural Information Processing SystemsFeb-11-2026, 20:47:49 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, algorithm 1, iterate, (14 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > Canada (0.04)
(3 more...)

Industry: Education > Educational Setting > Online (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.50)

Add feedback

6d538a6e667960b168d3d947eb6207a6-Paper-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 16:05:30 GMT

Prior work tries to improve the sampling locality by enforcing all the training jobs loading the same dataset in the same order and pace. However, such a solution isonly efficient under strong constraints: alljobs are trained onthe same dataset with the same starting moment and training speed. In this paper, we propose a new data loading method for efficiently training parallel DNNs with much flexible constraints. Our method is still highly efficient when different training jobs use different but overlapped datasets and have different starting moments andtrainingspeeds.

artificial intelligence, deep learning, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Illinois (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

We instead propose to estimate the final test performance based on a simple measure of training speed.

architecture, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Crowd

Neural Information Processing SystemsFeb-7-2026, 12:45:16 GMT

Eq. 6 in our paper is Lemma 2.1, pg 84. We will cite and clarify.

artificial intelligence, iteration, sinkhorn iteration, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.49)

Add feedback

Rethinking Memory and Communication Costs for Efficient Data Parallel Training of Large Language Models

Neural Information Processing SystemsFeb-5-2026, 16:21:13 GMT

Recently, various strategies for distributed training of large language models (LLMs) have been proposed.By categorizing them into basic strategies and composite strategies, we have discovered that existing basic strategies provide limited options in specific scenarios, leaving considerable room for optimization in training speed.In this paper, we rethink the impact of memory and communication costs on the training speed of LLMs, taking into account the impact of intra-and inter-group communication performance disparities, and then propose a new set of basic strategies named the \textbf{Pa}rtial \textbf{R}edundancy \textbf{O}ptimizer (PaRO).PaRO Data Parallelism (PaRO-DP) accelerates LLM training through refined model state partitioning and tailored training procedures.

artificial intelligence, large language model, natural language, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge

Neural Information Processing SystemsDec-24-2025, 17:51:58 GMT

Recently, a new trend of exploring sparsity for accelerating neural network training has emerged, embracing the paradigm of training on the edge. This paper proposes a novel Memory-Economic Sparse Training (MEST) framework targeting for accurate and fast execution on edge devices. The proposed MEST framework consists of enhancements by Elastic Mutation (EM) and Soft Memory Bound (&S) that ensure superior accuracy at high sparsity ratios. Different from the existing works for sparse training, this current work reveals the importance of sparsity schemes on the performance of sparse training in terms of accuracy as well as training speed on real edge devices. On top of that, the paper proposes to employ data efficiency for further acceleration of sparse training.

fast memory-economic sparse training framework, name change, sparsity scheme, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.58)

Add feedback